Weighted Sup-Norm Contractions in Dynamic Programming: A Review and Some New Applications
Author
Abstract
We consider a class of generalized dynamic programming models based on weighted sup-norm contractions. We provide an analysis that parallels the one available for discounted MDP and for generalized models based on unweighted sup-norm contractions. In particular, we discuss the main properties and associated algorithms of these models, including value iteration, policy iteration, and their optimistic and approximate variants. The analysis relies on several earlier works that use more specialized assumptions. In particular, we review and extend the classical results of Denardo [Den67] for unweighted sup-norm contraction models, as well as more recent results relating to approximation methods for discounted MDP. We also apply the analysis to stochastic shortest path problems where all policies are assumed proper. For these problems we extend three results that are known for discounted MDP. The first relates to the convergence of optimistic policy iteration and extends a result of Rothblum [Rot79], the second relates to error bounds for approximate policy iteration and extends a result of Bertsekas and Tsitsiklis [BeT96], and the third relates to error bounds for approximate optimistic policy iteration and extends a result of Thiery and Scherrer [ThS10b].

† Dimitri Bertsekas is with the Dept. of Electr. Engineering and Comp. Science, M.I.T., Cambridge, Mass., 02139. His research was supported by NSF Grant ECCS-0801549, and by the Air Force Grant FA9550-10-1-0412.
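As a rough illustration of the setting the abstract describes, the sketch below runs value iteration on a toy stochastic shortest path problem and measures progress in a weighted sup-norm, ||J||_v = max_i |J(i)| / v(i). Everything in it is an assumption made for illustration and is not taken from the paper: the two-state model `MODEL`, the weights `WEIGHTS`, and the function names. In this toy model every policy reaches the terminal state, and the Bellman operator contracts with modulus 1/2 under the weights (2, 1), while in the unweighted sup-norm it is only nonexpansive.

```python
# Hypothetical toy SSP: nonterminal states 0 and 1, plus an implicit
# cost-free terminal state. MODEL[i][u] = (stage cost, {next state: prob});
# any probability mass not listed goes to the terminal state.
MODEL = {
    0: {"go": (1.0, {1: 1.0}), "stop": (4.0, {})},
    1: {"go": (1.0, {})},
}

# Assumed weights v; for this model they give contraction modulus 1/2.
WEIGHTS = [2.0, 1.0]


def weighted_sup_norm(J, v):
    """Return max_i |J(i)| / v(i)."""
    return max(abs(J[i]) / v[i] for i in range(len(J)))


def bellman(J):
    """Apply (TJ)(i) = min_u [ g(i,u) + sum_j p_ij(u) J(j) ]."""
    return [
        min(
            cost + sum(p * J[j] for j, p in nxt.items())
            for cost, nxt in MODEL[i].values()
        )
        for i in sorted(MODEL)
    ]


def value_iteration(J, v, tol=1e-12, max_iter=1000):
    """Iterate J <- TJ until successive iterates are within tol in the v-norm."""
    for _ in range(max_iter):
        TJ = bellman(J)
        diff = [a - b for a, b in zip(TJ, J)]
        if weighted_sup_norm(diff, v) <= tol:
            return TJ
        J = TJ
    return J
```

Calling `value_iteration([0.0, 0.0], WEIGHTS)` on this toy model returns the optimal costs for states 0 and 1. The stopping test uses the weighted norm only to mirror the abstract's setting; any norm would serve for a finite model of this size.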
Related resources
Modern Computational Applications of Dynamic Programming
Computational dynamic programming, while of some use for situations typically encountered in industrial and systems engineering, has proved to be of much greater significance in many areas of computer science. We review some of these applications here.
An equivalent representation for weighted supremum norm on the upper half-plane
In this paper, firstly, we obtain some inequalities which estimate complex polynomials on the circles. Then, we use these estimates and a Moebius transformation to obtain the dual of these estimates for the lines in the upper half-plane. Finally, for an increasing weight on the upper half-plane with certain properties and holomorphic functions f on the upper half-plane we obtain an equivalent representa...
Two Equivalent Presentations for the Norm of Weighted Spaces of Holomorphic Functions on the Upper Half-plane
Introduction: In this paper, we intend to show that, without any growth condition on the weight function, we are always able to present a weighted sup-norm on the upper half-plane in terms of a weighted sup-norm on the unit disc and the supremum of holomorphic functions on certain lines in the upper half-plane. Material and methods: We use a certain transform between the unit disc and the uppe...
Some inequalities involving lower bounds of operators on weighted sequence spaces by a matrix norm
Let $A = (a_{n,k})_{n,k \ge 1}$ and $B = (b_{n,k})_{n,k \ge 1}$ be two non-negative matrices. Denote by $L_{v,p,q,B}(A)$ the supremum of those $L$ satisfying the inequality $\|Ax\|_{v,B(q)} \ge L \|x\|_{v,B(p)}$, where $x \ge 0$, $x \in l_p(v,B)$, and $v = (v_n)_{n=1}^{\infty}$ is an increasing, non-negative sequence of real numbers. In this paper, we obtain a Hardy-type formula for $L_{v,p,q,B}(H)$, where $H$ is the Hausdorff matrix and $0 < q \le p \le 1$. Also...
Interval Weighted Comparison Matrices – A Review
Interval comparison matrices (ICM) now play an important role in decision making under uncertainty, so a brief review of the solution methods used for ICM should be useful. In this paper, the common methods are divided into four categories: Goal Programming Method (GPM), Linear Programming Method (LPM), Non-Linear Programming Method (NLPM) and Statistic Analysis (SA). GPM...
Publication date: 2012